Generating Natural Language Attacks in a Hard Label Black Box Setting

Authors

Abstract

We study an important and challenging task of attacking natural language processing models in a hard label black box setting. We propose a decision-based attack strategy that crafts high quality adversarial examples on text classification and entailment tasks. Our proposed attack strategy leverages a population-based optimization algorithm to craft plausible and semantically similar adversarial examples by observing only the top label predicted by the target model. At each iteration, the optimization procedure allows word replacements that maximize the overall semantic similarity between the original and the adversarial text. Further, our approach does not rely on using substitute models or any kind of training data. We demonstrate the efficacy of our proposed approach through extensive experimentation and ablation studies on five state-of-the-art target models across seven benchmark datasets. In comparison to attacks proposed in prior literature, we are able to achieve a higher success rate with a lower word perturbation percentage, even in this highly restricted setting.
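The sketch below illustrates the general idea of such a hard label attack, not the paper's exact population-based algorithm: a simplified greedy word-substitution loop that queries only the target model's top label and keeps substitutions that preserve misclassification while increasing semantic similarity with the original text. The helpers `predict_top_label`, `get_synonyms`, and `similarity` are hypothetical placeholders.

```python
# A minimal sketch of a hard-label word-substitution attack loop.
# The paper uses a population-based (genetic) search; this greedy variant
# only illustrates the core idea: query the top label, keep substitutions
# that preserve misclassification while maximizing semantic similarity.
import random
from typing import Callable, List

def hard_label_attack(
    words: List[str],
    predict_top_label: Callable[[List[str]], int],        # target model: top label only
    get_synonyms: Callable[[str], List[str]],              # candidate replacements per word
    similarity: Callable[[List[str], List[str]], float],   # semantic similarity score
    max_iters: int = 100,
) -> List[str]:
    orig_label = predict_top_label(words)

    # 1) Random initialization: substitute words until the predicted label flips.
    adv = list(words)
    for i in random.sample(range(len(words)), len(words)):
        candidates = get_synonyms(words[i])
        if candidates:
            adv[i] = random.choice(candidates)
        if predict_top_label(adv) != orig_label:
            break
    if predict_top_label(adv) == orig_label:
        return words  # attack failed under this substitution budget

    # 2) Refinement: move words back toward the original (or better synonyms)
    #    whenever that increases similarity and keeps the flipped label.
    for _ in range(max_iters):
        improved = False
        for i in range(len(words)):
            if adv[i] == words[i]:
                continue
            for cand in [words[i]] + get_synonyms(words[i]):
                trial = adv[:i] + [cand] + adv[i + 1:]
                if (predict_top_label(trial) != orig_label
                        and similarity(words, trial) > similarity(words, adv)):
                    adv, improved = trial, True
        if not improved:
            break
    return adv
```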


Related papers

Efficient Label Contamination Attacks Against Black-Box Learning Models

Label contamination attack (LCA) is an important type of data poisoning attack where an attacker manipulates the labels of training data to make the learned model beneficial to him. Existing work on LCA assumes that the attacker has full knowledge of the victim learning model, whereas the victim model is usually a black-box to the attacker. In this paper, we develop a Projected Gradient Ascent ...
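As a rough illustration of the projected gradient ascent idea mentioned in this blurb (not the paper's actual formulation), one can relax the binary label-flip decisions to a continuous vector, ascend on the attacker's objective, and project back onto a flip budget; `grad_fn` below is a hypothetical stand-in for the gradient of that objective.

```python
# A generic projected gradient ascent (PGA) sketch under a label-flip budget.
# Binary flip indicators are relaxed to [0, 1]; after each ascent step the
# vector is clipped and rescaled into the budget, then rounded at the end.
import numpy as np

def projected_gradient_ascent(grad_fn, n_samples, budget, lr=0.1, steps=100):
    """grad_fn(q) returns the gradient of the attacker's objective w.r.t.
    the relaxed flip vector q (hypothetical; depends on the victim model)."""
    q = np.full(n_samples, budget / n_samples)   # relaxed flip indicators
    for _ in range(steps):
        q = q + lr * grad_fn(q)        # ascent on the attacker's objective
        q = np.clip(q, 0.0, 1.0)       # box projection
        if q.sum() > budget:           # simple scaling back into the budget
            q *= budget / q.sum()
    # Keep the `budget` largest entries as the labels to flip.
    flips = np.zeros(n_samples, dtype=bool)
    flips[np.argsort(q)[-budget:]] = True
    return flips
```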


Witness Finding in the Black-Box Setting

We propose an abstract framework for studying search-to-decision reductions for NP. Specifically, we study the following witness finding problem: for a hidden nonempty set W ⊆ {0, 1}^n, the goal is to output a witness in W with constant probability by making randomized queries of the form “is Q ∩ W nonempty?” where Q ⊆ {0, 1}^n. Algorithms for the witness finding problem can be seen as a general fo...
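For intuition, the sketch below shows the classic prefix-fixing strategy for witness finding when queries of the form “does any string with this prefix lie in W?” are allowed; it finds a witness with n + 1 deterministic queries. The paper's setting is more restrictive, so this is only a baseline illustration; `nonempty` is a hypothetical oracle.

```python
# Deterministic prefix-fixing strategy for witness finding:
# fix one bit at a time, always extending a prefix that still hits W.

def find_witness(n, nonempty):
    """nonempty(prefix) -> True iff some string in W starts with `prefix`."""
    if not nonempty(""):          # W is empty: no witness exists
        return None
    prefix = ""
    for _ in range(n):
        # Extend the prefix by whichever bit still leads to a member of W.
        prefix += "0" if nonempty(prefix + "0") else "1"
    return prefix

# Example with W = {"101", "110"}:
W = {"101", "110"}
oracle = lambda p: any(w.startswith(p) for w in W)
print(find_witness(3, oracle))    # -> "101"
```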


Query-limited Black-box Attacks to Classifiers

We study black-box attacks on machine learning classifiers where each query to the model incurs some cost or risk of detection to the adversary. We focus explicitly on minimizing the number of queries as a major objective. Specifically, we consider the problem of attacking machine learning classifiers subject to a budget of feature modification cost while minimizing the number of queries, where...
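A toy greedy sketch of the trade-off described above: modify features under a total modification-cost budget while counting every query made to the model. The `predict` function and the `candidate_mods` list are hypothetical placeholders, not the paper's method.

```python
# Greedy feature modification under a cost budget, with explicit query counting.
def query_limited_attack(x, predict, candidate_mods, cost_budget):
    """candidate_mods: list of (feature_index, new_value, cost) tuples."""
    queries = 1
    orig = predict(x)                       # first query: original prediction
    spent = 0.0
    adv = list(x)
    # Try cheap modifications first so the budget stretches further.
    for idx, val, cost in sorted(candidate_mods, key=lambda m: m[2]):
        if spent + cost > cost_budget:
            continue
        trial = list(adv)
        trial[idx] = val
        queries += 1                        # one query per candidate tried
        if predict(trial) != orig:
            return trial, queries           # evasion found
        adv, spent = trial, spent + cost    # keep the change, accumulate cost
    return None, queries                    # no evasion within the budget
```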


Cube Attacks on Tweakable Black Box Polynomials

Almost any cryptographic scheme can be described by tweakable polynomials over GF(2), which contain both secret variables (e.g., key bits) and public variables (e.g., plaintext bits or IV bits). The cryptanalyst is allowed to tweak the polynomials by choosing arbitrary values for the public variables, and his goal is to solve the resultant system of polynomial equations in terms of their commo...
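The core step behind cube attacks can be illustrated with a toy polynomial: XOR-summing the polynomial over all assignments of a chosen “cube” of public variables leaves its superpoly in the remaining variables. The polynomial below is invented for illustration and not taken from any real cipher.

```python
# Toy cube-summation step: sum a tweakable polynomial over a cube of public
# variables (over GF(2)) to recover its superpoly in the secret variables.
from itertools import product

def cube_sum(poly, cube_vars, n_public, secret):
    """XOR poly(public, secret) over all 0/1 assignments of cube_vars,
    with the remaining public variables fixed to 0."""
    acc = 0
    for bits in product((0, 1), repeat=len(cube_vars)):
        public = [0] * n_public
        for var, b in zip(cube_vars, bits):
            public[var] = b
        acc ^= poly(public, secret)
    return acc

# Toy polynomial p(v, k) = v0*v1*k0 ^ v0*v1*v2 ^ k1*v2 ^ 1 over GF(2).
p = lambda v, k: (v[0] & v[1] & k[0]) ^ (v[0] & v[1] & v[2]) ^ (k[1] & v[2]) ^ 1
# Summing over the cube {v0, v1} (with v2 = 0) recovers the superpoly k0.
for key in product((0, 1), repeat=2):
    assert cube_sum(p, cube_vars=[0, 1], n_public=3, secret=key) == key[0]
```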


Generating Adversarial Malware Examples for Black-Box Attacks Based on GAN

Machine learning has been used to detect new malware in recent years, while malware authors have strong motivation to attack such algorithms. Malware authors usually have no access to the detailed structures and parameters of the machine learning models used by malware detection systems, and therefore they can only perform black-box attacks. This paper proposes a generative adversarial network (...
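As a hedged illustration of the constraint such GAN-based malware attacks typically enforce (the generator may only add binary features, so the malware keeps functioning), here is a minimal, untrained generator sketch; all weights, sizes, and names are made up for illustration.

```python
# Untrained generator sketch: map (malware features, noise) to a 0/1
# perturbation mask, then combine with logical OR so features are only added.
import numpy as np

rng = np.random.default_rng(0)
n_features = 128

def generator(malware_features, noise, W1, W2):
    """Map concatenated (features, noise) through a tiny 2-layer net to a mask."""
    h = np.tanh(np.concatenate([malware_features, noise]) @ W1)
    return (1.0 / (1.0 + np.exp(-(h @ W2))) > 0.5).astype(np.int8)

W1 = rng.normal(size=(n_features + 16, 64))            # made-up weights
W2 = rng.normal(size=(64, n_features))
x = rng.integers(0, 2, size=n_features).astype(np.int8)  # toy malware features
z = rng.normal(size=16)                                   # noise input

perturbation = generator(x, z, W1, W2)
x_adv = np.maximum(x, perturbation)   # OR: only feature additions allowed
assert np.all(x_adv >= x)             # original features are all preserved
```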



Journal

Journal title: Proceedings of the ... AAAI Conference on Artificial Intelligence

Year: 2021

ISSN: 2159-5399, 2374-3468

DOI: https://doi.org/10.1609/aaai.v35i15.17595